See the file "extensions.txt" about design of Plugins/Extensions.


ENCODING AND CHARACTERS

We deal with UTF-8 encoding.  If you want to use some other encoding for your wiki software, then JCreole is not the tool for you.  Similarly, we "write" line delimiters only as \n.  We never write \r's.
As JCreole output will usually be combined with other material to generate HTML page, you should definitely do a '.replace("\n", "\r\n")' if the other content uses DOS line separators.

You must either supply a stream that has no control characters other than \r and tabs (\t) (... and this means no \r carriage returns), or use the new* static factory methods to strip them from your input.  It would multiplied the complexity of the scanner greatly to have to accommodate the other characters.
If you will be serving huge documents like books, you may instantiate your own Reader filter to assure this on your own, but it could be difficult to do so without losing the benefit of low-level caching.
(If you do implement a char-by-char filtering Reader that is high performance and has no capacity limit, send it to me and I'll incorporate it and remove the limitations described here).


DIRECT HTML

Users may code <, ", etc. in their Creole input, of course, and those characters will be translated to &lt; and &quot;, etc.
For security reasons, the only way that a Creole document author can get the real characters <, ", etc., directly into output is by using the <<~...>> plugin supplied for that purpose.


CHARACTER ENTITIES

Entities of the form &[a-zA-Z0-9]+; are passed from Creole input to output unadulterated.
(That expression is in Regexp. format, not BNF).
See the values in the "Entity Name" columns in the tables at http://www.w3schools.com/tags/ref_entities.asp for the many entities that you can use.
Considering whether to allow escaping this behavior like "~&lt;" to have output write "&amp;lt;", but not doing so for now since that would violate the Creole 1.0 spec.
